Markov Game Controller Design Algorithms

Authors

  • Rajneesh Sharma
  • M. Gopal
Abstract

Markov games generalize the Markov decision process to a multi-agent setting. The two-player zero-sum Markov game framework offers an effective platform for designing robust controllers. This paper presents two novel controller design algorithms that use ideas from the game-theory literature to produce reliable controllers that maintain performance in the presence of noise and parameter variations. A more widely used approach to controller design is H∞ optimal control, which suffers from high computational demand and may at times be infeasible. Our approach generates an optimal control policy for the agent (the controller) via a simple linear program, enabling the controller to learn about the unknown environment. The controller faces an unknown environment, which in our formulation corresponds to the behavior rules of the noise, modeled as the opponent. The proposed controller architectures attempt to improve controller reliability by gradually mixing algorithmic approaches drawn from the game-theory literature with the Minimax-Q Markov game solution approach, in a reinforcement-learning framework. We test the proposed algorithms on a simulated inverted pendulum swing-up task and compare their performance against standard Q-learning.

Keywords: Reinforcement learning, Markov decision process, matrix games, Markov games, smooth fictitious play, controller, inverted pendulum.
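For orientation, the linear program mentioned in the abstract can be sketched with the standard Minimax-Q formulation from the literature. This is a sketch under assumptions, not necessarily the exact form used in the paper. The symbols are introduced here for illustration only: s is the state, a the controller action, o the opponent (noise) action, Q(s,a,o) the learned action value, π the controller's mixed policy, α the learning rate, γ the discount factor, and r the reward.

```latex
% State value: the controller maximizes its worst-case expected return
V(s) = \max_{\pi \in \Delta(A)} \; \min_{o \in O} \; \sum_{a \in A} \pi(a)\, Q(s, a, o)

% Equivalent linear program in the variables v and \pi(a)
\begin{aligned}
\text{maximize}   \quad & v \\
\text{subject to} \quad & \sum_{a \in A} \pi(a)\, Q(s, a, o) \ge v \quad \forall\, o \in O, \\
                        & \sum_{a \in A} \pi(a) = 1, \qquad \pi(a) \ge 0 \quad \forall\, a \in A
\end{aligned}

% Temporal-difference update after observing (s, a, o, r, s')
Q(s, a, o) \leftarrow (1 - \alpha)\, Q(s, a, o) + \alpha \left[ r + \gamma\, V(s') \right]
```

The optimal v equals V(s), and the maximizing π is the controller's policy in state s. Standard Q-learning, the baseline named in the abstract, drops the opponent action o and replaces the max-min with a plain maximum over a.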

Similar resources

Design and Practical Implementation of a New Markov Model Predictive Controller for Variable Communication Packet Loss in Network Control Systems

The current paper investigates the influence of packet losses in network control systems (NCSs) using the model predictive control (MPC) strategy. The study focuses on two main types of packet loss along the communication paths: sensor-to-controller and controller-to-actuator. A new Markov-based method is employed to recursively estimate the probability of time delay in controller to ac...

Hybrid Markov Game Controller Design Algorithms for Nonlinear Systems

Markov games can be effectively used to design controllers for nonlinear systems. The paper presents two novel controller design algorithms that incorporate ideas from the game-theory literature to address safety and consistency issues of the ‘learned’ control strategy. A more widely used approach to controller design is H∞ optimal control, which suffers from high computational demand and at...

A Markov Game Controller for Finite State Space Nonlinear Systems

The optimal control problem for nonlinear systems in the presence of external disturbances has been formulated as a two-player zero-sum Markov game between the disturbance and the control. In the reinforcement learning (RL) paradigm, controller design for nonlinear systems has been framed using either the Markov decision process (MDP) setting or H∞ theory. While the MDP framework assumes a stationary...

Hybrid Fuzzy Fictitious Play based Control for Robotic Manipulators

In this work, we propose a hybrid controller based on game theory for robotic manipulators. The controller implements the hybrid strategy using fuzzy inference systems on a continuous-time two-link manipulator. We strive to formulate what may be ca...

Verification of Open Interactive Markov Chains

Interactive Markov chains (IMC) are compositional behavioral models extending both labeled transition systems and continuous-time Markov chains. IMC pair the modeling convenience owed to compositionality properties with the effective verification algorithms and tools owed to Markov properties. Thus far, however, IMC verification did not consider compositionality properties, but considered closed systems...

Publication date: 2012